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Abstract — In compressive sensing, tlie basis pursuit algoritlim 
aims to find tlie sparsest solution to an underdetermined linear 
equation system. In this paper, we generalize basis pursuit to 
finding the sparsest solution to higher order nonlinear systems 
of equations, called nonlinear basis pursuit. In contrast to the ex- 
isting nonlinear compressive sensing methods, the new algorithm 
that solves the nonlinear basis pursuit problem is convex and 
not greedy. The novel algorithm enables the compressive sensing 
approach to be used for a broader range of applications where 
there are nonlinear relationships between the measurements and 
the unknowns. 



I. Introduction 

Consider the problem of finding the sparsest solution x to 
a linear system of equations: 
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where y, and bi are presumed given. This problem has received 
considerable attention recently in the area of compressive 
sensing. The exact solution to ([TJ is known as £o-minimization 
(4-min): 
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(2a) 
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It is well known that solving ^o-min is an NP-hard problem. 
Hence, it is not practical to directly solve the problem except 
when the system dimension n is very small. 

In the literature, algorithms for finding an approximate 
solution to ^ can be divided into two categories. The first 
category includes greedy algorithms ifTSl . Il28l . 04|- Because 
these greedy algorithms iteratively update their estimate of x 
based on local information, their computational complexity 
in the processing of computing the estimation updates is 
significantly lower than that of ^Q-min. However, their main 
disadvantage is a weaker guarantee of convergence, especially 
comparing to the second category below. 

The second category of algorithms solves a convex relaxed 
problem known as ^i-minimization or basis pursuit lfT3l : 
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Compared to greedy algorithms, basis pursuit provably recov- 
ers the exact solution as ^o-min under some mild conditions 
as described in compressive sensing theory lfT6l . lH), Q. 
As the nonconvex £o-norm function is replaced by a convex 
£i-norm function, basis pursuit can be solved by convex 
optimization. In sparse optimization literature, there have been 
extensive discussions about accelerating the implementation of 
basis pursuit. The interested reader is referred to |24|, |39|. 
For further readings on greedy algorithms, basis pursuit, and 
compressive sensing, see ITtI . 

More recently, compressive sensing theory has been ex- 
tended to solving nonlinear problem, which is called nonlinear 
compressive sensing (NLCS) 0], |[T]: 
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where fi{x) in general can be considered a smooth function. 

We are particularly interested in solving Q, as it can 
extend the compressive sensing approach to a broad range of 
applications where the relationship between the measurements 
and the unknowns is nonlinear. For example, the problem of 
quadratic basis pursuit (QBP) ll32l . (31], ll30l is a special case 
of the NLCS formulation, which is a fundamental solution to 
the compressive phase retrieval problem in the applications of 
diffraction imaging ll27l and sub-wavelength imaging |36|. 

Similar to solving if)-mm, directly solving Q in its original 
form is very expensive and intractable in practice. Therefore, 
there is a need to seek more efficient numerical algorithms 
to estimate sparse solutions in Q while their convergence 
can be also guaranteed. As the topic of nonlinear compressive 
sensing is rather new, there are just a few algorithms that have 
been proposed in the literature. For instance, one of the first 
papers discussing this problem is |5 1, which proposed a greedy 
gradient-based algorithm. Another greedy approach was also 
proposed in i23l . In |[T|, the authors proposed several iterative 
hard-thresholding and sparse simplex pursuit algorithms. As 
these algorithms are nonconvex greedy solutions, the analysis 
of their convergence typically only concerns about their local 
behavior. In |4|, the author also considered a generalization 
of the restricted isometry property (RIP) to support the use 
of similar iterative hard-thresholding algorithms for solving 
general NLCS problems. 

In this paper, we present a novel solution to NLCS, 
called nonlinear basis pursuit (NLBP). The work extends our 
previous publications in quadratic compressive sensing and 
compressive phase retrieval ||3TI . ||32]| . and proposes a convex 
algorithm to solve NLBP via a high-order Taylor expansion 
and a lifting technique. The work was inspired by several 
recent works on CS applied to the phase retrieval problem 

ma, ma, na, mgi, ini, in, isa, eu, im, ei, ma. 



The paper is organized as follows. We will first discuss some 
basic notation for this paper in Sectionjll] Then we will present 
the NLBP problem and its analysis of theoretical convergence 
in Section III In Section IVJ we will present a numerical 
efficient convex solver to estimate the sparse solution of NLBP. 



Finally, in Section VI we will present numerical evaluation to 
validate the performance of our algorithm and compare with 
other existing solutions. 

II. Notation and Assumptions 

In this paper, we use bold face to denote vectors and 
matrices, and normal font for scalars. We denote the trans- 
pose of a real vector by x^ and the conjugate transpose 
of a complex vector by x^. X{i,j) denotes the (i,j)th 
element. Similarly, we let x{i) be the ith element of the 
vector X. Given two matrices X and Y, we use the fol- 
lowing fact that their product in the trace function com- 
mutes, namely, Tr{XY) = Ti-{YX), under the assump- 
tion that the dimensions match. || • ||o counts the num- 
ber of nonzero elements in a vector or matrix; similarly, 
II • 111 denotes the element-wise ^i-norm of a vector or matrix, 
i.e., , the sum of the magnitudes of the elements; whereas || • | 
represents the ^2 -norm for vectors and the spectral norm for 
matrices. We assume that the functions fi{-),i = 1,. . . ,N are 
analytic functions. 

III. Nonlinear Basis Pursuit 

Since fi{-),i = 1,...,A^, are analytic functions, we can 
express them using their Taylor expansions. Using multi-index 
notation, we can write the Taylor expansion of /j around Xq 
as 
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If q is an even integer, we can now rewrite a q order Taylor 
expansion as 
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where Q^ is a ("T^) x ("l^j-^ymnietric matrix, and x 
contains all the monomials of the elements of x with degree 

<'Z/2. 

Example 1: Let x = [xi X2\^ and f{x) — l + xi+x\. We 
consider a 4th order Taylor expansion around Xq — 0. Then 
the set of 2-tuple a's is 

A = {(0,0), (1,0), (0,1), (1,1), (2,0), (0,2), (2,1), (1,2), 
(3,0), (0,3), (2,2), (3,1), (1,3), (4,0), (0,4)}. 



We can easily verify that |^| = ("+'^) 



Hence, we can rewrite f{x) as 
t 
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Again, we verify that the dimension of x is equal to ( ^ ) = 6. 

From the example it is easy to see that elements of x are 
generally dependent. The dependencies between elements of 
X needs to be made explicit. 

Example 2: Let, as in the previous example 



a; = [1 xi X2 2;ia;2 



(9) 



It is clear that e.g., the second element xi times the third 
element X2 gives the third element a;ia::2 of x. This can be 
expressed as 
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Constructively, the dependences between elements can be 



generated as follows. Let Q 
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be a ("!*) X ("t») 



1 and set all other 
1. fc = / = 2 and 



symmetric matrix with (5jy_|_i(1, 1) 
elements to zero. Set j/at+i — \, i 
m^ N + 2. 

1) If k ^ I and x{i) = x{k)x{l), let Q„ be a ("+^) x 

2 

(""g^) -symmetric matrix with Q.^^-^{k^l) = Q^^-^{l^k) = 

1/2, Q„(l,z) = Q„,(^,l) = -1/2, set all other 
elements to zero, Hm = and set m — m + 1. 

2) If k — I and x{i) — x{k)x{l), let (5„j be a 

("g^) X (" , 2~j.syjjjjjjei-fi(; matrix with Qj^{l^l) — 1, 

2 2 

Q^{l,i) — Q,„(J, 1) = —1/2, set all other elements to 

zero, Hm — ^ and set rn = m + 1. 

3) If fc < ("t^), set fc = fc + 1 and return to step 1, 

2 
otherwise continue to the next step. 

4) If I < ("2^), set Z = ; + 1, fc = 2 and return to step 1, 

2 
otherwise continue to the next step. 

5) If i < ("t^), set i = i + 1, fc = / = 2 and return to 

2 
step 1. Otherwise set M = m — 1, return {(j/i, Qi)}.fii 

and abort. 



The dependencies can now be expressed as 

y^=xQ^x, i = N + 1,...,M. 



(10) 



Note that these constraints are all quadratic. 

Assuming that the qth order Taylor expansion is a good 
approximation for {/i(a;)}fli, then the problem of finding 
the sparsest solution to Q can be written as: 
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Next, we employ a lifting technique used extensively in 
quadratic programming |37l, ll25l . ||29l , |fT9 l and define a 
positive semi-definite matrix xx^ = X, which satisfies, for 
i = 1, . . . , M, the following equalities 



y^ 



x^Q.x = Ti{x'^Q,x) = TiiQiXx'^) = Tr(Q,X) 
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The nonconvex problem in ([TT} can now be shown equivalent 
to 

l^llo 
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subj. to y, = Tr(Q,X), z = 1, . , 
rank(X) = 1, X h 0. 

Due to the £o-norm function and the rank condition in 
([TSj, the problem is still combinatorial. To relax the ^g-norm 
function in ( [T3] l, we can replace the £o-norm with the li- 
norm. To relax the rank condition, we can remove the rank 
constraint and instead minimize rank(X) in the objective 
function. Furthermore, since the rank of a matrix is still not 
a convex function, we choose the replace the rank with the 
nuclear norm of X, which is known to be a convex heuristic 
of the rank function |18|, |10|. For a semidefinite matrix X, 
it is also equal to the trace of X. Note that there are several 
other heuristics for the rank. For example, the log det heuristic 
described in ifTSl is an interesting alternative (see also ll36l ). 
However, we choose to use the nuclear norm heuristic here 
even though our theoretical results also holds for e.g., the 
log dot heuristic. We finally obtain the convex program 
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subj. to y, = Tr(Q,X), 
XhO, 

where A > is a design variable. We refer to ^\A\ as nonlinear 
basis pursuit (NLBP). 

IV. Theoretical Analysis 

NLBP is a convex relaxation of the combinatorial problem 
given in ( [TT| ). It is of interest to know when the relaxation 
is tight. As a special case, when the degree of the analytic 
functions fi{x) are no greater than two, the problem is 
known as quadratic basis pursuit (QBP) |32|, [31j|. In fact, 
the underlying optimization problem of NLBP is the same 
as that of QBP. Therefore, the theoretical analysis about the 
performance guarantees of QBP also applies to NLBP in this 
paper In this section, we only highlight several key results 
that extend from the proofs given in ll32l . OTl . 

First, it is convenient to introduce a linear operator B: 
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We consider a generalization of the restricted isometry prop- 
erty (RIP) of the linear operator B. 

Definition 1 (RIP): A linear operator B{-) as defined in ( fTS] ) 
is (e, fc)-RIP if 

\B{x)r ^ 



< e 



(16) 



for all ||X||o < fc and X 7^0. 

We can now state the following theorem: 

Theorem 2 (Guaranteed recovery using RIP): Let x be 

the solution to ( [TT| ). The solution of NLBP X is equal to xx^ 
if it has rank 1 and B{-) is (e, 2||X||o)-RIP with e < 1. 

The RIP-type argument may be difficult to check for a given 
matrix and are more useful for claiming results for classes 
of matrices/linear operators. For instance, it has been shown 



that random Gaussian matrices satisfy the RIP with high 
probability. However, given realization of a random Gaussian 
matrix, it is indeed difficult to check if it actually satisfies the 
RIP 

In the literature, there exist two alternative arguments, 
namely, the spark condition II13II and the mutual coherence 
lfT6ll . The spark condition usually gives tighter bounds but is 
known to be difficult to compute as well. On the other hand, 
mutual coherence may give less tight bounds, but is more 
tractable. Next, we focus our discussion on mutual coherence, 
which is defined as: 

Definition 3 (Mutual coherence): For a matrix A, define 
the mutual coherence as 



/i(A) = max 
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(17) 



Let B be the matrix satisfying y = BX" = B{X) with 
JC" being the vectorized version of X. We are now ready to 
state the following theorem: 

Theorem 4 (Recovery using mutual coherence): Let x be 

the solution to ( [TT| ). The solution of NLBP, X, is equal to 
xx'^ if it has rank 1 and ||X||o < 0.5(1 + 1/a*(B)). 

V. Numerical Solvers 

Naturally, efficient numerical solvers that implement non- 
linear basis pursuit are desirable. There are many classes of 
methods, commonly used in compressed sensing, which can be 
used to solve non-smooth SDPs. Among these include interior 
point methods, which is used in the popular software package 
CVX 1 20 1, gradient projection methods |2|, and augmented 
Lagrangian methods (ALM) |2|, outer approximation methods 
|22J, and the alternating directions method of multipliers 
(ADMM), see for instance H, lH. 

Interior point methods are known generally to scale poorly 
to moderate- or large-scale problems. Gradient projection 
methods require a projection onto the feasible set; for our 
problem, this feasible set is the intersection of a subspace 
with the positive semidefinite cone. The complexity of this 
projection operator limits the benefits of using a gradient 
projection method. For ALM, the augmented primal and dual 
objection functions are still SDPs, which are equally difficult 
to solve in each iteration as the original problem. Also, we 
found that outer approximation methods converge very slowly. 

On the other hand, ([T4]i decomposes nicely into the ADMM 
framework. That is, we can define the following functions: 



hiiX) 



Tr(X) if y, = Tr(Q,X), t = I, 
cx) otherwise 
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if X ^ 
oo otherwise 

9iZ) = X\\Z\\^ 

Then, ([T4| becomes: 
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min /^l(Xl) + /^2(X2)+.g(Z) 
subj. to Xi = X2 = Z 



which has the form that ADMM applies to. ADMM has 
strong convergence results, converges quickly in practice, and 
scales well to large data sets |6|. For more details of this 
implementation, we refer the reader to ll32l . ifSTl . 

VI. Numerical Evaluation 
A. A Simple Example 

Let n = 5, A^ = 50 and consider 

y,= ^ q^x", i = l,...,iV, a;eM", (20) 

4>|a|>0 

{q^}4>|Q|>o was generate by sampling from a Gaussian 
distribution and x was zero except for two elements that both 
were one. A mote Carlo simulation consisting of 100 trials 
was performed. We compared NLBP (fourth order Taylor ex- 
pansion), QBP (second order Taylor expansion), and LASSO 
(first order Taylor expansion). a;o = was used. The results 
are given in Table |l] 

TABLE I 
Success rates of NLBP, QBP and LASSO for TV = 50 and n = 5. 



Method 



NLBP QBP LASSO 



Success rates 



100% 



74% 



0% 



B. Polynomial Equation System 

In this example we consider the problem of finding the 
dense x that solves a system of 4th order polynomials 



y^ 



J2 9"= 
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(21) 



given the coefficients {'7Q,}4>|a|>o- We will let iV = 60, n = 5 
and generate x by sampling from a Gaussian distribution with 
a standard deviation of 10. Figure [T| shows the box plots of 
the squared residuals from a monte Carlo simulation consisting 
of 100 trials. New polynomial coefficients were generated at 
random for each trial and a new x. NLBP found the true 
solution (within machine precision) in 99 out of the 100 trials. 
QBP never found the coiTect solution. Both used A = 0. 
LASSO with A = (the least squares estimate) did not give 
a satisfactory estimate. 

VII. Conclusion and Future Work 

The main contribution of this paper is a nonlinear compres- 
sive sensing algorithms based on convex relaxations. The algo- 
rithm, referred to as nonlinear basis pursuit, is rather general in 
that it applies to any analytic nonlinearity by approximate it by 
a Taylor expansion of desired order. Nonlinear basis pursuit 
inherits theoretical guarantees, such as guaranteed recovery 
etc from its linear relative (basis pursuit) and therefore comes 
with theoretical guarantees that greedy algorithms often lack. 
Nonlinear basis pursuit takes the form of a convex non-smooth 
SDP which can be solved using conventional software for 
problems of interest. 

It should be noticed that solving a nonlinear equation system 
is by itself a difficult problem. It is therefore quiet remarkable 



Fig. 1 . Box plot comparing NLBP and QBP for finding the dense solution of 
polynomial equation systems. The box plot summarizes the result from 100 
trials. 



that we with rather high succes rate manage to find the 
sparsest solution to the nonlinear equation system. In addition, 
solving an overdetermined nonlinear equation system is also 
difficult. As shown in the example section, nonlinear basis 
pursuit not only find sparse solutions but also can also provide 
dense solutions to nonlinear equation system when no sparse 
solutions is available. 

Convexifying nonlinear contraints using SDPs via its Taylor 
expansion is up to our knowledge novel and should find 
applications in many areas. This is seen as future research. 
Last, we have not discussed noise in this paper However, this 
extension is trivial and a nonlinear extension of basis pursuit 
denoising follows. 
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